1 Introduction


Increasingly complex datasets call for sophisticated statistical methods. Compared with
other fields for analyzing data, such as computer science and applied mathematics, statistics
can quantify the uncertainty of a phenomenon via hypothesis testing and/or interval
estimation, which is a distinguishing feature of the discipline. In conventional frequentist
statistics, testing a hypothesis or constructing a confidence interval requires a proper test
statistic or pivotal quantity whose distribution satisfies certain properties (Lehmann and
Romano 2006). However, this is quite difficult for many complex problems. The bootstrap
method (Efron 1979) relaxes this requirement on test statistics or pivotal quantities through
its ability to approximate distributions, and thus strengthens the power of conventional
frequentist inference. Another advantage of bootstrap is that it provides explicit
resampling-based solutions if the underlying model is well estimated. Consequently, bootstrap
has been well received in statistics and other fields. The frequentist properties of bootstrap
inferential procedures, such as bootstrap interval estimation, can be guaranteed by the
consistency of bootstrap distribution estimation (Shao and Tu 1995). This is also true for
related methods like subsampling (Politis, Romano, and Wolf 1999). Generally speaking, it is
more difficult to prove such consistency than to derive the asymptotic distribution of the
corresponding test statistic or pivotal quantity.

From the above discussion it can be seen that much theoretical work is needed before a
proposed method can be claimed to be a frequentist one. This is not easy for complex
problems, and thus limits the applicability of the frequentist approach. In this paper we
provide a very general approach based on local optimization to complement current
frequentist inference. Our approach can be viewed as an extension of the classical bootstrap
method, and reduces to it when the region for optimization shrinks to its centre. On the
theoretical side, the tests and confidence intervals constructed by our approach possess
asymptotic frequentist properties as long as we have consistent estimators of unknown
parameters. This feature indicates that we do not need to derive any (asymptotic)
distribution or to prove the consistency of distribution estimation before using the proposed
approach. In addition, with a proper region for optimization, the proposed approach is first
order asymptotically equivalent to the bootstrap method for regular problems. On the
computational side, our approach only requires the optimal objective value of an optimization
problem over a local region, which can be obtained by standard optimization techniques. We
also present simple experimental design-based algorithms, including a neighborhood bootstrap
method, to solve the optimization problem. These algorithms are easy for practitioners to
implement, and produce satisfactory results in our simulations.

The rest of this paper is organized as follows. Sections 2 and 3 introduce local
optimization-based hypothesis testing and interval estimation, respectively. Their asymptotic
frequentist properties are studied in Section 4. Some implementation issues are discussed in
Section 5. Section 6 presents four non-regular examples, including a high-dimensional problem
and a nonparametric regression problem, to illustrate the proposed approach. We end the paper
with some discussion in Section 7.

2 Local optimization-based hypothesis testing 

Let the random sample $X$ be drawn from a distribution $F(\cdot,\theta)$, where $\theta$ lies
in the parameter space $\Theta$. Here $\Theta$ can be a subset of a Euclidean space or an
infinite-dimensional space. We are interested in testing

$$H_0: \theta \in \Theta_0 \quad \text{versus} \quad H_1: \theta \in \Theta \setminus \Theta_0, \tag{1}$$

where $\Theta_0$ is a closed subset of $\Theta$. Let $T = T(X) \in \mathbb{R}$ be a test
statistic. Suppose that $T$ tends to take a large value when $H_0$ does not hold. It is known
that the p-value for testing (1) is defined as

$$P = \sup_{\phi \in \Theta_0} \Pr(T^*_\phi \geq T \mid T), \tag{2}$$

where $T^*_\phi = T(X^*_\phi)$ and $X^*_\phi$ is an independent copy of $X$ from
$F(\cdot,\phi)$ (Fisher 1959). Given a significance level $\alpha \in (0,1)$, we reject $H_0$
if $P < \alpha$. This test strictly controls the Type I error within the Neyman-Pearson
framework, as shown in the following proposition.

Proposition 1. Under $H_0$,

$$\Pr(P < \alpha) \leq \alpha.$$

Proof. Let $G_\theta$ denote the cumulative distribution function (c.d.f.) of $-T$, i.e.,
$G_\theta(x) = \Pr(-T \leq x)$. Denote

$$G_\theta^{-1}(t) = \inf\{x : G_\theta(x) \geq t\}. \tag{3}$$

For $\theta \in \Theta_0$, we have

$$\begin{aligned}
\Pr(P < \alpha) &= \Pr\Big(\sup_{\phi \in \Theta_0} \Pr(T^*_\phi \geq T \mid T) < \alpha\Big) \\
&\leq \Pr\big(\Pr(T^*_\theta \geq T \mid T) < \alpha\big) = \Pr\big(G_\theta(-T) < \alpha\big) \\
&\leq \Pr\big(-T < G_\theta^{-1}(\alpha)\big) \leq \alpha.
\end{aligned} \tag{4}$$

This completes the proof. □

Proposition 1 is a general result, which does not require any assumption on $T$. From
Proposition 1, a test is obtained by solving the stochastic optimization problem in (2), which
can be rewritten as

$$P = \sup_{\phi \in \Theta_0} \int I(T(x) \geq t)\, dF(x,\phi), \tag{5}$$

where $I$ is the indicator function and $t$ is the realization of $T$. In principle, any
hypothesis testing problem can be solved in this way as long as the corresponding optimization
problem in (5) is solvable. In a limited number of trivial cases, the problem in (5) has an
obvious solution; an example is the one-sided Z-test. Except for such cases, however, this
method faces computational difficulties: the stochastic optimization problem is generally
very hard to solve, especially when $\Theta_0$ is an unbounded set.

In the statistical literature, a commonly used strategy to overcome these difficulties is
based on the asymptotic distribution of the test statistic $T$. The optimization problem in
(5) is often solvable when the distribution of $T$ is replaced by its asymptotic distribution.
For example, with a $T$ whose asymptotic distribution is free of unknown parameters, it is
trivial to solve (5). For complex problems, it is often not easy to derive the asymptotic
distribution, or to find a $T$ whose asymptotic distribution has desirable properties. A
Bayesian remedy is the posterior predictive p-value of Meng (1994), which averages the
objective function in (5) over the posterior distribution of the parameter under the null
hypothesis.

Here we provide a more general strategy without any requirement on the distribution of $T$.
Suppose that $H_0$ holds. For the true parameter $\theta \in \Theta_0$, to obtain a p-value
that controls the Type I error it suffices to optimize the objective function in (2) over any
set that contains $\theta$, instead of over the whole $\Theta_0$; see the first inequality in
(4). Consequently, we need to compute

$$P_\theta = \max_{\phi \in \mathcal{N}(\theta) \cap \Theta_0} \int I(T(x) \geq t)\, dF(x,\phi), \tag{6}$$

where $\mathcal{N}(\theta)$ is a closed neighborhood of $\theta$ containing $\theta$. Here
“sup” in (5) is replaced by “max” if we assume that $\mathcal{N}(\theta) \cap \Theta_0$ is a
compact subset of $\Theta_0$ on which $\int I(T(x) \geq t)\, dF(x,\phi)$ is continuous with
respect to $\phi$. In practice, we use a consistent estimator $\hat\theta$ of $\theta$ under
$H_0$ to replace $\theta$ in (6), and obtain

$$P_{\mathrm{LOT}} = \max_{\phi \in \mathcal{N}(\hat\theta) \cap \Theta_0} \int I(T(x) \geq t)\, dF(x,\phi). \tag{7}$$

If the probability of $\theta \in \mathcal{N}(\hat\theta)$ tends to one, then the test based
on the p-value in (7) is asymptotically valid. We call this test the local optimization-based
test (LOT) throughout the paper. LOT only requires the maximum value of the objective function
over a neighborhood of $\hat\theta$, which can be obtained by standard optimization
techniques. This feature makes LOT work for many complex problems in which it is hard to
analyze the distribution of $T$.

When $\mathcal{N}(\hat\theta)$ shrinks to $\hat\theta$, (7) becomes

$$P_{\mathrm{B}} = \int I(T(x) \geq t)\, dF(x,\hat\theta), \tag{8}$$

which is the p-value of the bootstrap test (Davison and Hinkley 1997). Therefore, LOT can be
viewed as an extension of the bootstrap test. LOT always controls the Type I error
asymptotically as long as $\hat\theta$ is a consistent estimator, whereas the bootstrap test
can fail for non-regular cases where the bootstrap distribution estimator is inconsistent
(Bickel and Ren 2001). From (2), through (7), to (8), LOT is a bridge connecting Fisher’s
significance test and Efron’s bootstrap test; see Table 1.


Table 1: Comparison of three tests

Test                          Validity in Neyman-Pearson’s framework    Difficulty of implementation
Fisher’s significance test    always                                    high
LOT                           under weak conditions                     moderate
Efron’s bootstrap test        under strong conditions                   low


3 Local optimization-based interval estimation 

The idea of approximating the p-value via local optimization can be modified to construct
confidence intervals. Suppose that the parameter of interest is $\xi = \xi(\theta) \in
\mathbb{R}$, and that $\hat\xi = \hat\xi(X)$ is an estimator of $\xi$. Let $H_\theta$ denote
the c.d.f. of the pivotal quantity $\xi - \hat\xi$, i.e., $H_\theta(x) = \Pr(\xi - \hat\xi
\leq x)$. It should be pointed out that the (asymptotic) distribution of $\xi - \hat\xi$ is
allowed to depend on unknown parameters, and this is different from the standard definition
of a pivotal quantity in textbooks. Define $H_\theta^{-1}$ as in (3).

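Proposition 2. For all $\theta \in \Theta$ and $\alpha \in (0,1)$,

$$\Pr\Big(\xi \leq \hat\xi + \sup_{\phi \in \Theta} H_\phi^{-1}(1-\alpha)\Big) \geq 1 - \alpha, \tag{9}$$

$$\Pr\Big(\xi \geq \hat\xi + \inf_{\phi \in \Theta} H_\phi^{-1}(\alpha)\Big) \geq 1 - \alpha. \tag{10}$$

Proof. We have

$$\Pr\Big(\xi \leq \hat\xi + \sup_{\phi \in \Theta} H_\phi^{-1}(1-\alpha)\Big)
\geq \Pr\big(\xi \leq \hat\xi + H_\theta^{-1}(1-\alpha)\big)
= H_\theta\big(H_\theta^{-1}(1-\alpha)\big) \geq 1 - \alpha. \tag{11}$$

This completes the proof of (9), and that of (10) is similar. □

By Proposition 2, the upper and lower $1-\alpha$ confidence bounds of $\xi$ are given by
$\hat\xi + \sup_{\phi \in \Theta} H_\phi^{-1}(1-\alpha)$ and $\hat\xi + \inf_{\phi \in \Theta}
H_\phi^{-1}(\alpha)$, respectively. The equal-tailed $1-\alpha$ confidence interval of $\xi$
is $[\hat\xi + \inf_{\phi \in \Theta} H_\phi^{-1}(\alpha/2),\ \hat\xi + \sup_{\phi \in \Theta}
H_\phi^{-1}(1-\alpha/2)]$. These interval limits all require solving an optimization problem

$$\sup_{\phi \in \Theta} H_\phi^{-1}(\gamma) \quad \text{or} \quad \inf_{\phi \in \Theta} H_\phi^{-1}(\gamma)$$

for some $\gamma \in (0,1)$, which is often difficult. As with (6), Proposition 2 also holds if
we take the supremum over an arbitrary region containing the true value of $\theta$; see the
first inequality in (11). Suppose that $\hat\theta$ is a consistent estimator of $\theta$.
Under some mild conditions, we can get asymptotically valid confidence limits by solving

$$\sup_{\phi \in \mathcal{N}(\hat\theta)} H_\phi^{-1}(\gamma) \quad \text{or} \quad \inf_{\phi \in \mathcal{N}(\hat\theta)} H_\phi^{-1}(\gamma). \tag{12}$$

Specifically, the upper and lower $1-\alpha$ confidence bounds of $\xi$ are $\hat\xi +
\sup_{\phi \in \mathcal{N}(\hat\theta)} H_\phi^{-1}(1-\alpha)$ and $\hat\xi + \inf_{\phi \in
\mathcal{N}(\hat\theta)} H_\phi^{-1}(\alpha)$, respectively, and the equal-tailed $1-\alpha$
confidence interval of $\xi$ is $[\hat\xi + \inf_{\phi \in \mathcal{N}(\hat\theta)}
H_\phi^{-1}(\alpha/2),\ \hat\xi + \sup_{\phi \in \mathcal{N}(\hat\theta)}
H_\phi^{-1}(1-\alpha/2)]$. Here “sup” (or “inf”) can be replaced by “max” (or “min”) if
$\mathcal{N}(\hat\theta)$ is a compact subset of $\Theta$ on which $H_\phi^{-1}(\gamma)$ is
continuous with respect to $\phi$. We call these confidence intervals local optimization-based
confidence intervals (LOCIs) throughout the paper. When $\mathcal{N}(\hat\theta)$ shrinks to
$\hat\theta$, LOCIs become the bootstrap hybrid confidence intervals (Shao and Tu 1995).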

4 Asymptotic properties 

This section discusses asymptotic properties of the proposed local optimization-based
methods. Further results involving some computational methods are deferred to the Appendix.
Here we only consider one-sided LOCIs; similar results also hold for two-sided LOCIs and
LOTs. Some notation and definitions are needed. The parameter space $\Theta$ is assumed to be
a metric space with metric $\rho$. For $A \subset \Theta$, let $|A|$ denote $\max\{\rho(a,b) :
a, b \in A\}$. For two c.d.f.’s $F_1$ and $F_2$, the Kolmogorov distance between them is
defined as $d_K(F_1,F_2) = \sup_{x \in \mathbb{R}} |F_1(x) - F_2(x)|$. We allow the
neighborhood $\mathcal{N}(\cdot)$ to depend on $n$ and denote it by $\mathcal{N}_n(\cdot)$ for
clarity. We use “$\to_d$” to denote “converge in distribution”, and let “a.s.” be the
abbreviation for “almost surely”. As in Section 3, let $H_\theta$ denote the c.d.f. of $\xi -
\hat\xi$. Since $\mathcal{N}_n(\hat\theta)$ is a random set, for $\phi \in
\mathcal{N}_n(\hat\theta)$, $H_\phi$ is actually a random c.d.f., i.e., $H_\phi(x) =
\Pr(\xi(\phi) - \hat\xi(X^*) \leq x \mid X)$, where the conditional distribution of $X^*$
given $X$ is $F(\cdot,\phi)$.

Assumption 1. As $n \to \infty$, $\Pr\big(\theta \in \mathcal{N}_n(\hat\theta)\big) \to 1$ for
all $\theta \in \Theta$.

If $\hat\theta$ is consistent, then $\mathcal{N}_n(\hat\theta)$ is easy to construct so as to
satisfy Assumption 1; see (15) in Section 5.1. We immediately have the following theorem.

Theorem 1. Under Assumption 1, for all $\theta \in \Theta$ and $\alpha \in (0,1)$,

$$\liminf_{n \to \infty} \Pr\Big(\xi \leq \hat\xi + \sup_{\phi \in \mathcal{N}_n(\hat\theta)} H_\phi^{-1}(1-\alpha)\Big) \geq 1 - \alpha. \tag{13}$$

We next show that LOCIs are first order asymptotically equivalent to the bootstrap confidence
intervals under regularity conditions. Specifically, if the bootstrap distribution estimator
of $\xi - \hat\xi$ is consistent, then “$\geq$” in (13) can be replaced by “=”. Several
assumptions are needed.


Assumption 2. As $n \to \infty$, $|\mathcal{N}_n(\hat\theta)| \to 0$ (a.s.).

Assumption 3. As $n \to \infty$, $\hat\theta \to \theta$ (a.s.) for all $\theta \in \Theta$.

Assumption 4. (i) There exists a sequence of numbers $a_n \to \infty$ such that $a_n(\xi -
\hat\xi) \to_d K$, where $K$ is a continuous c.d.f. and is strictly increasing on its support.

(ii) For $\phi \in \mathcal{N}_n(\hat\theta)$, $d_K(\tilde H_\phi, K) \to 0$ (a.s.), where
$\tilde H_\phi(x) = \Pr(a_n[\xi(\phi) - \hat\xi(X^*)] \leq x \mid X)$ and $X^*$ is the
bootstrap sample drawn from $F(\cdot,\phi)$.

Assumption 4 indicates that the bootstrap distribution estimator of $\xi - \hat\xi$ is
consistent (Shao and Tu 1995). We can use the conditional distribution of $a_n[\xi(\hat\theta)
- \hat\xi(X^*)]$ given $X$ to approximate that of $a_n(\xi - \hat\xi)$, and this approximation
leads to asymptotically valid confidence intervals for $\xi$. Assumption 4 holds for general
regular cases. We present two simple examples.

Example 1. Let $X_n$ be a random number from a binomial distribution $\mathrm{BN}(n,\pi)$ with
parameter $\pi \in (0,1)$. Consider the pivotal quantity $\pi - X_n/n$. It is clear that
$\sqrt{n}(\pi - X_n/n) \to_d K$, where $K$ is the c.d.f. of $N(0, \pi(1-\pi))$. This result
also holds for any strongly consistent estimator $\hat\pi_n$ of $\pi$. Specifically, with
$X_n^* \sim \mathrm{BN}(n,\hat\pi_n)$, we can easily prove that $d_K(\tilde H_n, K) \to 0$
(a.s.) by the central limit theorem for triangular arrays, where $\tilde H_n(x) =
\Pr(\sqrt{n}(\hat\pi_n - X_n^*/n) \leq x \mid X_n)$, and then Assumption 4 holds.
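The convergence claimed in Example 1 can be checked numerically. The sketch below is an illustration rather than part of the paper's method: it computes the Kolmogorov distance between the bootstrap law of $\sqrt{n}(\hat\pi_n - X_n^*/n)$ and the $N(0, \hat\pi_n(1-\hat\pi_n))$ limit exactly, by scanning the $n+1$ atoms of the binomial distribution. The function names and the choice of plugging $\hat\pi_n$ into the limiting variance are assumptions made for the illustration.

```python
import math

def normal_cdf(x, sd):
    """C.d.f. of N(0, sd^2) evaluated at x."""
    return 0.5 * (1.0 + math.erf(x / (sd * math.sqrt(2.0))))

def kolmogorov_distance_binomial(n, p_hat):
    """Exact d_K between the law of sqrt(n) * (p_hat - X*/n), X* ~ BN(n, p_hat),
    and N(0, p_hat * (1 - p_hat)). The supremum is attained at an atom,
    either at the c.d.f. value or at its left limit."""
    sd = math.sqrt(p_hat * (1.0 - p_hat))
    pmf = [math.comb(n, j) * p_hat**j * (1.0 - p_hat) ** (n - j) for j in range(n + 1)]
    upper_tail = 0.0   # Pr(X* > j), accumulated from j = n downwards
    worst = 0.0
    for j in range(n, -1, -1):
        y = math.sqrt(n) * (p_hat - j / n)   # atom of the bootstrap law
        left = upper_tail                    # H(y-) = Pr(X* > j)
        upper_tail += pmf[j]                 # H(y)  = Pr(X* >= j)
        k = normal_cdf(y, sd)
        worst = max(worst, abs(left - k), abs(upper_tail - k))
    return worst

d_small = kolmogorov_distance_binomial(25, 0.3)
d_large = kolmogorov_distance_binomial(400, 0.3)
```

Comparing `d_small` with `d_large` shows the distance shrinking as $n$ grows, which is the behaviour required by Assumption 4(ii).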

Example 2. Let $X_1, \ldots, X_n$ be i.i.d. random variables from a c.d.f. $F$ with $E X_1^2 <
\infty$. Here we do not assume a parametric form for $F$. Then the parameter space $\Theta =
\{F \in \mathcal{F} : \int x^2\, dF(x) < \infty\}$ is an infinite-dimensional metric space with
metric $d_K$, where $\mathcal{F}$ denotes the set of all c.d.f.’s on $\mathbb{R}$. A strongly
consistent estimator of $F$ is the empirical distribution $\hat F(x) = \sum_{i=1}^n I(X_i \leq
x)/n$. Suppose that the parameter of interest is $\mu = E X_1$. Let $\bar X_n$ denote the
sample mean. Consider the pivotal quantity $\mu - \bar X_n$. First, we have $\sqrt{n}(\mu -
\bar X_n) \to_d \Phi(\cdot / v(F))$, where $\Phi$ is the c.d.f. of $N(0,1)$ and $v(F) = \{\int
(x - \int x\, dF(x))^2\, dF(x)\}^{1/2}$ is the standard deviation under $F$. Second,
take 

Un{F) = {G e 0 : dK{G,F) < |n(G) -n(F)| < 1/n^'^]. (14) 

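It is easy to verify Assumptions 1-3. Furthermore, for $F_n \in \mathcal{N}_n(\hat F)$ and
$X_1^*, \ldots, X_n^*$ i.i.d. from $F_n$, through verifying the Lindeberg condition in the
central limit theorem for triangular arrays, we have that $d_K(\tilde H^{0}_{F_n}, \Phi) \to
0$ (a.s.), where $\tilde H^{0}_{F_n}(x) = \Pr(\sqrt{n}(E X_1^* - \bar X_n^*)/v(F_n) \leq x \mid
X_1, \ldots, X_n)$. Denote $\tilde H_{F_n}(x) = \Pr(\sqrt{n}(E X_1^* - \bar X_n^*) \leq x \mid
X_1, \ldots, X_n)$. It follows that $d_K(\tilde H_{F_n}(\cdot), \Phi(\cdot / v(F))) \to 0$
(a.s.). Then Assumption 4 holds.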

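Theorem 2. Under Assumptions 1-4, for all $\theta \in \Theta$ and $\alpha \in (0,1)$,

$$\lim_{n \to \infty} \Pr\Big(\xi \leq \hat\xi + \sup_{\phi \in \mathcal{N}_n(\hat\theta)} H_\phi^{-1}(1-\alpha)\Big) = 1 - \alpha.$$

Proof. Note that $\tilde H_\phi^{-1} = a_n H_\phi^{-1}$, with $\tilde H_\phi$ as in Assumption
4. For any $n$, there exists $\theta^{**} \in \mathcal{N}_n(\hat\theta)$ such that $\sup_{\phi
\in \mathcal{N}_n(\hat\theta)} \tilde H_\phi^{-1}(1-\alpha) < \tilde
H_{\theta^{**}}^{-1}(1-\alpha) + 1/n$. Under Assumptions 2 and 3, $\theta^{**} \to \theta$
(a.s.). Therefore, by Assumption 4, $\tilde H_{\theta^{**}}^{-1}(1-\alpha) \to
K^{-1}(1-\alpha)$ (a.s.). We have

$$\begin{aligned}
\Pr\Big(\xi \leq \hat\xi + \sup_{\phi \in \mathcal{N}_n(\hat\theta)} H_\phi^{-1}(1-\alpha)\Big)
&= \Pr\Big(a_n(\xi - \hat\xi) \leq \sup_{\phi \in \mathcal{N}_n(\hat\theta)} \tilde H_\phi^{-1}(1-\alpha)\Big) \\
&\leq \Pr\big(a_n(\xi - \hat\xi) \leq \tilde H_{\theta^{**}}^{-1}(1-\alpha) + 1/n\big) \\
&= \Pr\big(a_n(\xi - \hat\xi) \leq K^{-1}(1-\alpha) + o(1)\big) \to 1 - \alpha,
\end{aligned}$$

where the limit follows from Assumption 4(i) and the continuity of $K$. Combining this result
with Theorem 1, we complete the proof. □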

When applying bootstrap to a specific problem, we need to verify Assumptions 3 and 4 to
guarantee its frequentist properties. Theorems 1 and 2 indicate that we do not need to do
such theoretical work when using LOCI. With a proper $\mathcal{N}_n(\hat\theta)$, LOCI
possesses both the basic frequentist property in (13) and a potential bonus: it enjoys the
same first order frequentist property as the bootstrap method when the two assumptions hold
(although we may not know this). It can be expected that, under much stronger conditions,
LOCI has some high-order asymptotic properties like bootstrap (Hall 1992). We do not discuss
this here since it is difficult to specify $\mathcal{N}_n(\hat\theta)$ satisfying such
conditions for complex problems.

5 Implementation 

This section discusses how to implement LOT and LOCI. We focus on the cases where $\Theta$ is
a subset of a Euclidean space. Therefore, it suffices to solve finite-dimensional optimization
problems in LOT and LOCI. For some problems with infinite-dimensional parameter spaces, LOT
or LOCI is still available through reasonable simplification; see Section 6.3.

5.1 Specification of $\mathcal{N}(\hat\theta)$

The first issue is to determine the neighborhood $\mathcal{N}(\hat\theta)$ in (7) and (12)
over which we solve the optimization problem. Suppose that the dimension of $\Theta$ is $q$
and $\hat\theta = (\hat\theta_1, \ldots, \hat\theta_q)'$ is a consistent estimator of $\theta =
(\theta_1, \ldots, \theta_q)'$. The basic principle is to select $\mathcal{N}(\hat\theta)$
satisfying Assumption 1. A simple choice of $\mathcal{N}(\hat\theta)$ is

$$[\hat\theta_1 - \delta,\ \hat\theta_1 + \delta] \times \cdots \times [\hat\theta_q - \delta,\ \hat\theta_q + \delta] \tag{15}$$

for some small constant $\delta > 0$. If we further know the convergence rate of
$\hat\theta$, then the second principle is to select $\mathcal{N}(\hat\theta)$ satisfying
Assumption 2. By Theorem 2, this selection makes the local optimization-based method
asymptotically equivalent to bootstrap if the bootstrap distribution estimator is consistent.
For example, with $\hat\theta - \theta = O_p(1/\sqrt{n})$, a selection of
$\mathcal{N}(\hat\theta)$ simultaneously satisfying Assumptions 1 and 2 is

$$[\hat\theta_1 - \delta\log(n)/\sqrt{n},\ \hat\theta_1 + \delta\log(n)/\sqrt{n}] \times \cdots \times [\hat\theta_q - \delta\log(n)/\sqrt{n},\ \hat\theta_q + \delta\log(n)/\sqrt{n}] \tag{16}$$

for some constant $\delta > 0$. The constant $\delta$ in (15) or (16) can be specified
empirically. For complex problems, the exact convergence rate of $\hat\theta$ is difficult to
know. We will see in Section 6 that LOT or LOCI has good finite-sample performance even with
a simple $\mathcal{N}(\hat\theta)$ like that in (15), which only satisfies Assumption 1.
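As a minimal sketch of the two hypercube choices (15) and (16), the helper below returns the lower and upper limits of each coordinate interval. The function name `hypercube_neighborhood` and its arguments are hypothetical and introduced only for illustration.

```python
import math

def hypercube_neighborhood(theta_hat, delta, n=None):
    """Return [(lower_1, upper_1), ..., (lower_q, upper_q)] around theta_hat.

    With n=None the half-width is delta, as in (15); otherwise it is
    delta * log(n) / sqrt(n), as in (16).
    """
    half = delta if n is None else delta * math.log(n) / math.sqrt(n)
    return [(t - half, t + half) for t in theta_hat]

box15 = hypercube_neighborhood([0.2, 1.5], delta=0.1)          # fixed-width box (15)
box16 = hypercube_neighborhood([0.2, 1.5], delta=0.5, n=100)   # shrinking box (16)
```

The box in (16) shrinks at rate $\log(n)/\sqrt{n}$, slightly slower than the $1/\sqrt{n}$ rate of the estimator, which is why it can satisfy Assumptions 1 and 2 at the same time.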

It seems more reasonable to use the variances of the $\hat\theta_j$’s to construct
$\mathcal{N}(\hat\theta)$. When the variance estimators are not straightforward, the
jackknife, bootstrap (Shao and Tu 1995), or even Bayesian methods can be used to estimate the
variances. However, such methods add extra theoretical and computational work, and the final
form of $\mathcal{N}(\hat\theta)$ still contains constants that need to be specified
empirically. Therefore, we suggest using the variance estimators only when they are
straightforward.

5.2 Importance sampling-based approach 

Suppose that $F(\cdot,\theta)$ has a probability density function (p.d.f.) $f(\cdot,\theta)$
with respect to a $\sigma$-finite measure $\nu$, and that $\{f(\cdot,\phi) : \phi \in
\Theta_0\}$ has a common support. We use an importance sampling-based approach to solve the
stochastic optimization problem in (7). First we approximate the objective function in (7) by
importance sampling. Note that

$$u(\phi) = \int I(T(x) \geq t)\, f(x,\phi)\, d\nu(x) = E\left\{ I(T(X^*) \geq t)\, \frac{f(X^*,\phi)}{f(X^*,\hat\theta)} \right\},$$

where $X^* \sim f(\cdot,\hat\theta)$. According to the sample average approximation method in
stochastic optimization (Shapiro 2003), we compute the p-value as

$$P_{\mathrm{IS}} = \max_{\phi \in \mathcal{N}(\hat\theta) \cap \Theta_0} \hat u(\phi), \tag{17}$$

where

$$\hat u(\phi) = \frac{1}{M} \sum_{m=1}^{M} \left\{ I(T(X_m^*) \geq t)\, \frac{f(X_m^*,\phi)}{f(X_m^*,\hat\theta)} \right\} \tag{18}$$

is the approximation of $u(\phi)$ based on $X_1^*, \ldots, X_M^*$ i.i.d. from
$f(\cdot,\hat\theta)$ with Monte Carlo sample size $M$. With sufficiently large $M$,
$P_{\mathrm{IS}}$ can be arbitrarily close to $P_{\mathrm{LOT}}$ in (7). There are many
iterative algorithms available for solving the deterministic optimization problem in (17),
such as the interior point method (Boyd and Vandenberghe 2004).

We can also use an experimental design-based method to approximate the p-value in (7). Take
$L$ points $\phi_1, \ldots, \phi_L$ uniformly spaced over $\mathcal{N}(\hat\theta) \cap
\Theta_0$, and then compute

$$P_{\mathrm{IS\text{-}D}} = \max\{\hat u(\hat\theta), \hat u(\phi_1), \ldots, \hat u(\phi_L)\}, \tag{19}$$

where $\hat u$ is defined in (18). We call these points try points throughout this paper.
They can be constructed from so-called space-filling designs in experimental design; see
Section 5.4. Since $\mathcal{N}(\hat\theta)$ is a small neighborhood, $P_{\mathrm{IS\text{-}D}}$
often performs well with a moderate $L$. The design-based method is very easy to implement,
and is suitable for those who are not familiar with optimization methods. More sophisticated
space-filling design-based optimization methods can be found in Fang, Hickernell, and Winker
(1996).
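The design-based p-value can be sketched for a toy model. The example below assumes i.i.d. $N(\phi, 1)$ observations and takes the sample mean as the test statistic; both choices, the function name `is_design_pvalue`, and the numerical inputs are assumptions made for illustration and are not taken from the paper. One set of Monte Carlo samples drawn at the estimate is reweighted by the density ratio for every try point.

```python
import math
import random
import statistics

def is_design_pvalue(t_obs, theta_hat, try_points, n, M, seed=0):
    """Maximum over theta_hat and the try points of the importance-sampling
    estimate of Pr_phi(T(X*) >= t_obs), with T the sample mean of n
    i.i.d. N(phi, 1) draws (toy model, assumed for illustration)."""
    rng = random.Random(seed)
    samples = [[rng.gauss(theta_hat, 1.0) for _ in range(n)] for _ in range(M)]
    # Only samples with T(X*) >= t_obs contribute to the weighted average.
    hits = [x for x in samples if statistics.fmean(x) >= t_obs]

    def log_ratio(x, phi):
        # log f(x, phi) - log f(x, theta_hat); normalising constants cancel
        return 0.5 * sum((v - theta_hat) ** 2 - (v - phi) ** 2 for v in x)

    def u_hat(phi):
        return sum(math.exp(log_ratio(x, phi)) for x in hits) / M

    return max(u_hat(phi) for phi in [theta_hat, *try_points])

# H0: theta <= 0, null estimate 0.0, neighbourhood [-0.1, 0.1] cut down to Theta_0.
p_single = is_design_pvalue(0.4, 0.0, [], n=25, M=2000)
p_design = is_design_pvalue(0.4, 0.0, [-0.1, -0.05], n=25, M=2000)
```

Because the same seed produces the same samples, `p_design` can never fall below `p_single`; the extra try points can only raise the maximum.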

For LOCI, we have the following importance sampling-based method to compute the interval
limits when $F(\cdot,\theta)$ has a p.d.f. $f(\cdot,\theta)$ and $\{f(\cdot,\phi) : \phi \in
\Theta\}$ has a common support. Here we only consider the computation of upper limits, i.e.,
the first optimization problem in (12). Let $\psi = H_\phi^{-1}(\gamma)$ and $S(\phi,\psi) =
H_\phi(\psi)$. Suppose that $H_\phi$ is continuous and strictly increasing on its support for
$\phi \in \mathcal{N}(\hat\theta)$. The problem (12) is equivalent to the constrained
optimization problem

$$\max_{\phi \in \mathcal{N}(\hat\theta)} \psi \quad \text{subject to} \quad S(\phi,\psi) = \gamma. \tag{20}$$

For a $q$-dimensional space $\Theta$, the problem optimizes $q + 1$ variables. Similar to the
importance sampling-based sample average approximation method in (18), we use an approximation
of $S$,

$$\hat S(\phi,\psi) = \frac{1}{M} \sum_{m=1}^{M} \left\{ I\big(\xi(\phi) - \hat\xi(X_m^*) \leq \psi\big)\, \frac{f(X_m^*,\phi)}{f(X_m^*,\hat\theta)} \right\},$$

where $X_1^*, \ldots, X_M^*$ are i.i.d. from $f(\cdot,\hat\theta)$ with Monte Carlo sample
size $M$. The solution to

$$\max_{\phi \in \mathcal{N}(\hat\theta)} \psi \quad \text{subject to} \quad \hat S(\phi,\psi) = \gamma \tag{21}$$

can be used to approximate that to (20). Note that $\hat S(\phi,\psi)$ may not equal $\gamma$
exactly in (21). In practice we handle an equivalent problem

$$\max_{\phi \in \mathcal{N}(\hat\theta)} \psi \quad \text{subject to} \quad \hat S(\phi,\psi) \leq \gamma \tag{22}$$

instead of (21). A design-based method similar to (19) can also be used to solve (22). Since
(22) has no straightforward solution even for a given $\phi \in \mathcal{N}(\hat\theta)$, we
do not recommend such a method. A simpler and more general method for computing LOCIs is to
directly compute the quantiles of $H_\phi$ for a given $\phi$. This method is discussed in the
next subsection.


5.3 Neighborhood bootstrap 

This subsection discusses a general method, called neighborhood bootstrap, to implement LOT
and LOCI. This method still works for the cases where the importance sampling-based approach
in Section 5.2 fails. We first consider LOT. As with the design-based p-value in (19), take
$L$ try points $\phi_1, \ldots, \phi_L$ uniformly spaced over $\mathcal{N}(\hat\theta) \cap
\Theta_0$. The difference from (19) is that the neighborhood bootstrap method directly
approximates the objective value in (7) by the Monte Carlo method. Specifically, for each
$\phi_l$, $l = 0, 1, \ldots, L$, generate $X_{l,1}^*, \ldots, X_{l,M}^*$ i.i.d. from
$F(\cdot,\phi_l)$, where $\phi_0 = \hat\theta$. Then the p-value in (7) can be approximated by

$$P_{\mathrm{NB}} = \max_{l = 0, 1, \ldots, L} \left\{ \frac{1}{M} \sum_{m=1}^{M} I\big(T(X_{l,m}^*) \geq t\big) \right\}.$$

For LOCI, we still consider the computation of upper limits in (12). With $\{\phi_1, \ldots,
\phi_L\}$ uniformly spaced over $\mathcal{N}(\hat\theta)$, take bootstrap samples $X_{l,1}^*,
\ldots, X_{l,M}^*$ i.i.d. from $F(\cdot,\phi_l)$ for $l = 0, 1, \ldots, L$. Let $\hat
H_{\phi_l}^{-1}(\gamma)$ denote the sample $\gamma$-quantile of $\xi(\phi_l) -
\hat\xi(X_{l,1}^*), \ldots, \xi(\phi_l) - \hat\xi(X_{l,M}^*)$. Consequently, $\sup_{\phi \in
\mathcal{N}(\hat\theta)} H_\phi^{-1}(\gamma)$ can be approximated by

$$\max_{l = 0, 1, \ldots, L} \hat H_{\phi_l}^{-1}(\gamma). \tag{23}$$

Neighborhood bootstrap is a very general method. In principle, it can be applied to
infinite-dimensional parameter spaces if there are well-defined space-filling designs for such
spaces. Another advantage of neighborhood bootstrap is its easy implementation, especially for
computing LOCIs. For LOT, neighborhood bootstrap is slightly more time-consuming than the
importance sampling-based approach.
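A minimal sketch of neighborhood bootstrap for both the p-value and the upper confidence limit is given below. The generic function signatures, the toy model of i.i.d. $N(\theta, 1)$ data with the sample mean as estimator, and the numerical inputs are all assumptions introduced for illustration. The sample $\gamma$-quantile is taken as the $\lceil \gamma M \rceil$-th order statistic, which is one common convention.

```python
import math
import random
import statistics

def nb_pvalue(t_obs, points, sampler, statistic, M, rng):
    """Largest Monte Carlo estimate of Pr_phi(T(X*) >= t_obs) over the points."""
    return max(
        sum(statistic(sampler(phi, rng)) >= t_obs for _ in range(M)) / M
        for phi in points
    )

def nb_upper_limit(xi_hat, points, sampler, xi, xi_est, gamma, M, rng):
    """xi_hat plus the largest sample gamma-quantile of xi(phi) - xi_est(X*)."""
    k = min(M - 1, max(0, math.ceil(gamma * M) - 1))   # index of the order statistic
    quantiles = []
    for phi in points:
        draws = sorted(xi(phi) - xi_est(sampler(phi, rng)) for _ in range(M))
        quantiles.append(draws[k])
    return xi_hat + max(quantiles)

# Toy model (assumed): X_1, ..., X_n i.i.d. N(theta, 1), xi(theta) = theta.
n, M = 20, 400
sampler = lambda phi, rng: [rng.gauss(phi, 1.0) for _ in range(n)]
theta_hat = 0.3                                   # hypothetical observed sample mean
points = [theta_hat, 0.2, 0.25, 0.35, 0.4]        # phi_0 = theta_hat, then try points

upper = nb_upper_limit(theta_hat, points, sampler, lambda phi: phi,
                       statistics.fmean, 0.95, M, random.Random(1))

# Test of H0: theta <= 0 with T = sample mean; try points restricted to Theta_0.
null_points = [0.0, -0.1, -0.05]
p_nb = nb_pvalue(theta_hat, null_points, sampler, statistics.fmean, M, random.Random(2))
p_boot = nb_pvalue(theta_hat, null_points[:1], sampler, statistics.fmean, M, random.Random(2))
```

Here `p_boot` uses only the point $\phi_0$, so it corresponds to the ordinary bootstrap p-value, and with a shared seed `p_nb` is never smaller than it.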

5.4 Design of try points 

The design-based p-value in (19) and the neighborhood bootstrap method in Section 5.3 both
need $L$ try points $\phi_1, \ldots, \phi_L$ uniformly spaced over $\mathcal{N}(\hat\theta)$.
This subsection presents some discussion on the design of these points. Usually
$\mathcal{N}(\hat\theta)$ is selected as a $q$-dimensional hypercube like (15) or (16).
Specifically, suppose that $\mathcal{N}(\hat\theta) = [L_1, U_1] \times \cdots \times [L_q,
U_q]$. For $\psi_i = (\psi_{i1}, \ldots, \psi_{iq})' \in [0,1]^q$, $i = 1, \ldots, L$, let
$\phi_{ij} = L_j + \psi_{ij}(U_j - L_j)$, $i = 1, \ldots, L$, $j = 1, \ldots, q$, and we have
$\phi_i = (\phi_{i1}, \ldots, \phi_{iq})' \in \mathcal{N}(\hat\theta)$ for $i = 1, \ldots, L$.
Therefore, it suffices to consider the design of $\psi_1, \ldots, \psi_L$ in $[0,1]^q$, called
the initial design in the following. As mentioned in Section 5.2, the initial design can be
constructed from space-filling designs in $[0,1]^q$. Such designs include grids, Latin
hypercube designs (McKay, Beckman, and Conover 1979), and uniform designs (Fang et al. 2000),
among others. A simple choice is the following grid

$$\left\{\frac{1}{2U}, \frac{3}{2U}, \ldots, \frac{2U-1}{2U}\right\} \times \cdots \times \left\{\frac{1}{2U}, \frac{3}{2U}, \ldots, \frac{2U-1}{2U}\right\}, \tag{24}$$

where $U$ is a positive integer. There are $L = U^q$ points in the grid, and this leads to
unaffordable computations for large $q$. Another choice is the Latin hypercube design (LHD)
(McKay, Beckman, and Conover 1979), which is easy to construct for any $L$ and $q$. The LHD
is spaced uniformly in each dimension, and its space-filling properties over the whole
$[0,1]^q$ can be improved by iterative algorithms (Park 2001). There are functions for
generating LHDs in both MATLAB and R.

Note that in fact we need to design $\phi_1, \ldots, \phi_L$ in $\mathcal{N}(\hat\theta) \cap
\Theta_0$ for LOT or in $\mathcal{N}(\hat\theta) \cap \Theta$ for LOCI. For irregular or
constrained parameter spaces, this problem becomes complicated. A feasible solution is to
design more points in $\mathcal{N}(\hat\theta)$ and then to keep those in the intersection.
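The construction of try points can be sketched as follows, assuming the grid levels are the midpoints $(2j-1)/(2U)$, $j = 1, \ldots, U$, in each coordinate. The function names and the `keep` filter, which implements the "design more points and keep those in the intersection" step, are hypothetical.

```python
import itertools

def midpoint_grid(U, q):
    """All U**q points of [0, 1]^q with coordinates (2j - 1) / (2U), j = 1, ..., U."""
    levels = [(2 * j - 1) / (2 * U) for j in range(1, U + 1)]
    return list(itertools.product(levels, repeat=q))

def try_points(box, U, keep=lambda phi: True):
    """Map the initial design into box = [(L_1, U_1), ..., (L_q, U_q)] and
    keep only the points accepted by `keep` (e.g. membership in Theta_0)."""
    mapped = (
        tuple(lo + p * (hi - lo) for p, (lo, hi) in zip(psi, box))
        for psi in midpoint_grid(U, len(box))
    )
    return [phi for phi in mapped if keep(phi)]

box = [(-0.1, 0.1), (0.9, 1.1)]
all_points = try_points(box, U=3)
# Keep only points whose first coordinate is non-positive (small tolerance for rounding).
null_points = try_points(box, U=3, keep=lambda phi: phi[0] <= 1e-12)
```

With an odd $U$ the grid contains the centre of the box, so the estimate itself is among the try points.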

6 Illustrative examples 

This section presents four examples to illustrate LOT and LOCI, in which the (asymptotic)
distributions of the test statistics or pivotal quantities are non-regular or unclear.

6.1 Interval estimation for the maximum cell probability of the 
multinomial distribution 

Let $(X_{n1}, \ldots, X_{nk})'$ be the cell frequencies from a multinomial distribution
$\mathrm{MN}_k(n; \pi)$, where $\sum_{i=1}^k X_{ni} = n$, with the parameter $\pi = (\pi_1,
\ldots, \pi_k)'$, $\pi_i > 0$, $i = 1, \ldots, k$, and $\sum_{i=1}^k \pi_i = 1$. We consider
interval estimation for $\pi_{\max} = \max\{\pi_1, \ldots, \pi_k\}$. This problem is related
to some real applications including the diversity of ecological populations (Patil and Taillie
1979) and favorable numbers on a roulette wheel (Ethier 1982), and has been studied by Gelfand
et al. (1992), Glaz and Sison (1999), and Xiong and Li (2009), among others.

The maximum likelihood estimator (MLE) of $\pi$ is $(X_{n1}/n, \ldots, X_{nk}/n)'$, and that
of $\pi_{\max}$ is $\max_{1 \leq i \leq k} X_{ni}/n$. To avoid extreme values in the
estimators, we use the Bayesian estimator $\hat\pi = (\hat\pi_1, \ldots, \hat\pi_k)' =
((X_{n1} + 1/2)/(n + k/2), \ldots, (X_{nk} + 1/2)/(n + k/2))'$ from the Jeffreys prior (Ghosh,
Delampady, and Samanta 2007). The corresponding estimator of $\pi_{\max}$ is $\hat\pi_{\max} =
\max_{1 \leq i \leq k} \hat\pi_i$, whose asymptotic properties are the same as those of the
MLE. Xiong and Li (2009) showed that, when the set $\{i = 1, \ldots, k : \pi_i = \pi_{\max}\}$
contains more than one element, $\hat\pi_{\max}$ is not asymptotically normal and the
corresponding bootstrap distribution estimator is inconsistent. A remedy is to use
m-out-of-n bootstrap (Bickel, Götze, and van Zwet 1997). This method takes a bootstrap sample
$(X_{m1}^*, \ldots, X_{mk}^*)'$ from $\mathrm{MN}_k(m; \hat\pi)$ with $m = o(n)$, and then
approximates the distribution of $\sqrt{n}(\hat\pi_{\max} - \pi_{\max})$ by its bootstrap
analogue $\sqrt{m}(\pi_{\max}^* - \hat\pi_{\max})$, where $\pi_{\max}^* = \max_{1 \leq i \leq
k}\{(X_{mi}^* + 1/2)/(m + k/2)\}$. Xiong and Li (2010) proved that this approximation is
consistent, and thus results in asymptotically valid confidence intervals for $\pi_{\max}$.
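The estimator and one m-out-of-n bootstrap replicate described above can be sketched in a few lines. The cell counts, the seed, and the function names are hypothetical; the choice $m = \lfloor 2\sqrt{n} \rfloor$ follows the simulation setting reported in this subsection.

```python
import math
import random

def jeffreys_estimate(counts):
    """Bayesian estimator (X_i + 1/2) / (n + k/2) of the cell probabilities."""
    n, k = sum(counts), len(counts)
    return [(x + 0.5) / (n + k / 2) for x in counts]

def m_out_of_n_replicate(pi_hat, m, rng):
    """One bootstrap replicate of pi*_max based on a draw from MN_k(m; pi_hat)."""
    k = len(pi_hat)
    draws = rng.choices(range(k), weights=pi_hat, k=m)
    counts = [draws.count(i) for i in range(k)]
    return max(jeffreys_estimate(counts))

counts = [9, 8, 6, 4, 3]                  # hypothetical cell frequencies, n = 30
pi_hat = jeffreys_estimate(counts)
pi_max_hat = max(pi_hat)
m = int(2 * math.sqrt(sum(counts)))       # integer part of 2 * sqrt(n)
rep = m_out_of_n_replicate(pi_hat, m, random.Random(0))
```

Repeating `m_out_of_n_replicate` many times and rescaling by $\sqrt{m}$ would give the bootstrap analogue of the pivotal distribution.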

The LOCI of $\pi_{\max}$ can be easily constructed by the neighborhood bootstrap method in
(23), where the pivotal quantity is $\pi_{\max} - \hat\pi_{\max}$. We next conduct a
simulation study to compare the LOCI with the ordinary bootstrap and m-out-of-n bootstrap
methods. Here we focus on two-sided $1 - \alpha$ confidence intervals with $\alpha = 0.05$. In
our simulation study, $k$ is fixed at 5, and $n = 30$ and $60$ are considered. We use six
vectors of cell probabilities; see Table 2. In the m-out-of-n bootstrap method, $m$ is set as
the integer part of $2\sqrt{n}$. The neighborhood $\mathcal{N}(\hat\pi)$ is

$$[\hat\pi_1 - \delta\log(n)/\sqrt{n},\ \hat\pi_1 + \delta\log(n)/\sqrt{n}] \times \cdots \times [\hat\pi_k - \delta\log(n)/\sqrt{n},\ \hat\pi_k + \delta\log(n)/\sqrt{n}],$$

where two values of $\delta$, 0.1 and 0.5, are used. It is clear that $\mathcal{N}(\hat\pi)$
satisfies Assumptions 1 and 2. We use two grids in (24) to design the try points, with $U = 3$
for $\delta = 0.1$ and $U = 5$ for $\delta = 0.5$. Note that there is a constraint
$\sum_{i=1}^k \pi_i = 1$ in the parameter space. There are 51 and 101 try points in the two
grids, respectively. The bootstrap sample size is 5000 in all the above methods.

We repeat 5000 times to compute the coverage rates (CRs), mean lengths (MLs), and standard
deviations of lengths (SDLs) of the confidence intervals. The simulation results are shown in
Table 2. We can see that the bootstrap interval usually has low CR. For dispersed $\pi$, the
m-out-of-n bootstrap method lacks efficiency with longer ML, whereas the two LOCIs perform
better. As expected, the LOCI with $\delta = 0.5$ is more conservative than that with $\delta
= 0.1$. In summary, it can be concluded that the LOCI is at least comparable to the
m-out-of-n bootstrap interval.

Table 2: Simulation results in Section 6.1

π = (0.7, 0.075, 0.075, 0.075, 0.075)'
                       n = 30                     n = 60
                       CR      ML      SDL        CR      ML      SDL
Bootstrap              0.927   0.299   0.023      0.932   0.222   0.014
Bootstrap (m < n)      0.920   0.389   0.054      0.945   0.377   0.037
LOCI (δ = 0.1)         0.950   0.325   0.021      0.940   0.228   0.013
LOCI (δ = 0.5)         0.961   0.345   0.020      0.954   0.236   0.012

π = (0.5, 0.15, 0.15, 0.1, 0.1)'
                       n = 30                     n = 60
                       CR      ML      SDL        CR      ML      SDL
Bootstrap              0.846   0.297   0.040      0.912   0.237   0.011
Bootstrap (m < n)      0.881   0.368   0.065      0.955   0.362   0.038
LOCI (δ = 0.1)         0.897   0.321   0.033      0.931   0.244   0.008
LOCI (δ = 0.5)         0.967   0.350   0.018      0.967   0.259   0.008

π = (0.3, 0.175, 0.175, 0.175, 0.175)'
                       n = 30                     n = 60
                       CR      ML      SDL        CR      ML      SDL
Bootstrap              0.738   0.175   0.075      0.702   0.147   0.058
Bootstrap (m < n)      0.748   0.187   0.093      0.722   0.164   0.090
LOCI (δ = 0.1)         0.939   0.210   0.075      0.832   0.172   0.054
LOCI (δ = 0.5)         0.991   0.296   0.046      0.990   0.217   0.032

π = (0.3, 0.3, 0.2, 0.1, 0.1)'
                       n = 30                     n = 60
                       CR      ML      SDL        CR      ML      SDL
Bootstrap              0.893   0.207   0.064      0.909   0.168   0.035
Bootstrap (m < n)      0.913   0.234   0.081      0.955   0.210   0.065
LOCI (δ = 0.1)         0.954   0.248   0.064      0.944   0.205   0.036
LOCI (δ = 0.5)         0.979   0.327   0.036      0.966   0.241   0.023

π = (0.24, 0.24, 0.24, 0.24, 0.04)'
                       n = 30                     n = 60
                       CR      ML      SDL        CR      ML      SDL
Bootstrap              0.935   0.170   0.065      0.924   0.134   0.041
Bootstrap (m < n)      0.956   0.184   0.074      0.986   0.149   0.057
LOCI (δ = 0.1)         0.943   0.210   0.064      0.949   0.174   0.042
LOCI (δ = 0.5)         0.976   0.305   0.032      0.970   0.220   0.024

π = (0.2, 0.2, 0.2, 0.2, 0.2)'
                       n = 30                     n = 60
                       CR      ML      SDL        CR      ML      SDL
Bootstrap              0.906   0.135   0.063      0.946   0.095   0.049
Bootstrap (m < n)      0.982   0.136   0.074      0.996   0.088   0.059
LOCI (δ = 0.1)         0.950   0.175   0.064      0.963   0.127   0.047
LOCI (δ = 0.5)         0.937   0.280   0.038      0.963   0.195   0.028


6.2 Interval estimation for the location parameter of the three-parameter Weibull distribution

The Weibull distribution is widely used in many fields such as survival analysis (Cox and
Oakes 1984) and reliability (Murthy, Xie, and Jiang 2004). Let $X_1, \ldots, X_n$ be i.i.d.
observations from the Weibull distribution $\mathrm{Wbl}(a, b, \tau)$, whose p.d.f. is

$$f(x; a, b, \tau) = \frac{b}{a} \left(\frac{x - \tau}{a}\right)^{b-1} \exp\left\{-\left(\frac{x - \tau}{a}\right)^{b}\right\} \tag{25}$$

for $x > \tau$, $a > 0$, $b > 0$, and $\tau \in \mathbb{R}$. The parameters $a$, $b$, and
$\tau$ are known as the scale, shape, and location parameters, respectively. If $\tau$ is
known, then likelihood-based inference for the parameters is straightforward (Murthy, Xie, and
Jiang 2004). With an unknown $\tau$, the standard method faces difficulties since the
distributions do not have a common support (Blischke 1974). Estimation of the parameters of
the three-parameter Weibull distribution is still an active topic, and many estimators have
been proposed; see Lockhart and Stephens (1994), Cousineau (2009), and Teimouri, Hoseini, and
Nadarajah (2013), among others. Since the (asymptotic) distributions of these estimators are
difficult to derive, there are limited results on interval estimation for the parameters.

This subsection constructs LOCIs for $\tau$ based on the maximum product of spacings
(MPS) estimation (Cheng and Amin 1983). Our method is also applicable to the
other parameters. The MPS estimators $\hat{a}$, $\hat{b}$, and $\hat{\tau}$ are constructed by maximizing


\[
S(a, b, \tau) = \prod_{i=0}^{n} \int_{X_{(i)}}^{X_{(i+1)}} f(x; a, b, \tau)\,dx,
\]
Table 3: Simulation results in Section 6.2

                                   n = 10                      n = 20
(a, b)        Method       CR      ML      SDL         CR      ML      SDL
(0.5, 0.5)    Bootstrap    0.970   0.112   0.075       0.945   0.023   0.017
              LOCI         0.979   0.129   0.190       0.946   0.024   0.020
(2.5, 0.5)    Bootstrap    0.946   0.336   0.223       0.940   0.072   0.057
              LOCI         0.946   0.343   0.234       0.940   0.073   0.060
(0.5, 1.5)    Bootstrap    0.410   0.072   0.042       0.230   0.024   0.012
              LOCI         0.949   1.013   0.546       0.982   0.789   0.316
(2.5, 1.5)    Bootstrap    0.210   0.240   0.133       0.098   0.055   0.029
              LOCI         0.876   1.081   0.644       0.921   1.166   0.542
(0.5, 2.5)    Bootstrap    0.289   0.104   0.042       0.121   0.038   0.014
              LOCI         0.958   1.454   0.472       0.972   1.246   0.189
(2.5, 2.5)    Bootstrap    0.014   0.221   0.068       0.010   0.038   0.020
              LOCI         0.881   1.392   0.671       0.950   1.614   0.379

where $X_{(1)} \leq \cdots \leq X_{(n)}$ are the order statistics, $X_{(0)} = \tau$, and $X_{(n+1)} = \infty$. For all $a$, $b$, and $\tau$,
the MPS estimators are consistent (Cheng and Amin 1983). Furthermore, for $b > 2$, they
have the same asymptotic distributions as the MLEs; if $0 < b < 2$, then $\hat{a} - a = O_p(1/\sqrt{n})$,
$\hat{b} - b = O_p(1/\sqrt{n})$, and $\hat{\tau} - \tau = O_p(1/n^{1/b})$. It is not straightforward to construct confidence
intervals for $\tau$ from the asymptotic properties of $\hat{\tau}$ since $b$ is unknown. Furthermore, the validity
of the corresponding bootstrap confidence interval is unclear.
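
The MPS criterion can be evaluated as sketched below. The sketch works with the logarithm of the product of spacings, which has the same maximizer, and uses the conventions $X_{(0)} = \tau$ and $X_{(n+1)} = \infty$ stated above. The function names are hypothetical, and the maximization over $(a, b, \tau)$ is left to whatever optimizer the reader prefers.

```python
import math

def weibull3_cdf(x, a, b, tau):
    """C.d.f. of the three-parameter Weibull distribution."""
    if x <= tau:
        return 0.0
    return 1.0 - math.exp(-((x - tau) / a) ** b)

def log_product_of_spacings(data, a, b, tau):
    """Log of the product of the n + 1 spacings F(X_(i+1)) - F(X_(i)).

    F(X_(0)) = F(tau) = 0 and F(X_(n+1)) = F(infinity) = 1 close the ends.
    Returns -inf for parameter values outside the admissible region.
    """
    xs = sorted(data)
    if a <= 0 or b <= 0 or tau >= xs[0]:
        return -math.inf
    cdf_values = [0.0] + [weibull3_cdf(x, a, b, tau) for x in xs] + [1.0]
    total = 0.0
    for left, right in zip(cdf_values, cdf_values[1:]):
        spacing = right - left
        if spacing <= 0.0:
            return -math.inf  # ties or underflow make the product zero
        total += math.log(spacing)
    return total

value = log_product_of_spacings([1.5, 2.0, 3.0], a=1.0, b=1.0, tau=1.0)
```

Unlike the likelihood, this criterion stays bounded as $\tau$ approaches the smallest observation, which is why it remains usable for small shape parameters.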

We use neighborhood bootstrap to construct two-sided $1 - \alpha$ confidence intervals for $\tau$,
and conduct a simulation study to evaluate their performance. The pivotal quantity is $\tau - \hat{\tau}$.
The initial design is the grid in (24) with $U = 3$, which corresponds to $L = 27$. Since the
results are sensitive to the value of $b$, we set the neighborhood $\mathcal{N}(\hat{a}, \hat{b}, \hat{\tau})$ as

\[
[\hat{a} - \delta_n, \hat{a} + \delta_n] \times [\hat{b} - \delta_n, \hat{b} + \delta_n] \times [\hat{\tau} - \delta_n, \hat{\tau} + \delta_n],
\]

where $\delta_n = 4\exp\{-(1/\hat{b})^{[\ldots]}\}\log(n)/\sqrt{n}$. It is clear that $\mathcal{N}(\hat{a}, \hat{b}, \hat{\tau})$ satisfies Assumptions 1
and 2 for all $a$, $b$, and $\tau$ by the asymptotic properties of the MPS estimators. For $\tau = 1$, two values
of $n$, and several combinations of $(a, b)$, the simulation results based on 1000 repetitions are
reported in Table 3 with $\alpha = 0.05$. The bootstrap sample sizes used in the bootstrap interval
and the LOCI are both 1000. We can see that, for $b = 0.5$, the CR of the bootstrap interval is
satisfactory, and the LOCI performs similarly with a slightly longer ML. For larger
$b$, the bootstrap interval performs poorly, and the LOCI is much better in terms of CR.
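
To make the count $L = 27$ concrete, the sketch below builds a full factorial grid with three levels per coordinate inside a box centered at the estimates. It assumes equally spaced levels that include the end points of each interval, which may differ from the grid defined in (24); the function name, the center, and the half-width are hypothetical inputs, and constraints such as $a > 0$, $b > 0$, and $\tau$ below the smallest observation are not enforced.

```python
import itertools

def grid_try_points(center, half_widths, levels=3):
    """Full factorial grid with `levels` equally spaced values per coordinate."""
    axes = []
    for c, d in zip(center, half_widths):
        step = 2.0 * d / (levels - 1)
        axes.append([c - d + k * step for k in range(levels)])
    return list(itertools.product(*axes))

# Three coordinates (a, b, tau) with three levels each give 3**3 = 27 points.
points = grid_try_points(center=(0.5, 1.5, 1.0), half_widths=(0.1, 0.1, 0.1))
```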

6.3 Testing whether all the coefficients in the high-dimensional 
regression are nonnegative 

High-dimensional data analysis, which deals with models where the number of parameters
is larger than the sample size, has been a very active research area in recent years. We consider the
regression model

\[
y = X\beta + \varepsilon, \tag{26}
\]

where $X = (x_{ij})$ is the $n \times p$ regression matrix, $y = (y_1, \ldots, y_n)'$ is the response vector,
$\beta = (\beta_1, \ldots, \beta_p)'$ is the vector of regression coefficients, and $\varepsilon = (\varepsilon_1, \ldots, \varepsilon_n)'$ is a vector of
i.i.d. normal random errors with zero mean and finite variance $\sigma^2$. Let $p_0$ denote the number
of elements in $\{j = 1, \ldots, p : \beta_j \neq 0\}$. For $p \gg n$, we make the sparsity assumption $p_0 \ll n$. Many
methods have been proposed to estimate the sparse $\beta$ in (26), such as the lasso (Tibshirani
1996), the smoothly clipped absolute deviation method (Fan and Li 2001), and the minimax
concave penalty method (Zhang 2010). Under the assumption that all the coefficients are
known to be nonnegative, Efron et al. (2004) introduced a nonnegative lasso method to
estimate $\beta$, which solves

\[
\min_{\beta} \|y - X\beta\|^2 + \lambda \sum_{j=1}^{p} \beta_j \quad \text{subject to } \beta_j \geq 0,\ j = 1, \ldots, p, \tag{27}
\]

where $\lambda > 0$ is a tuning parameter. Applications of this method can be found in Frank
and Reiser (2006) and Wu, Yang, and Liu (2014). In this subsection we use the data to
test whether the assumption in the nonnegative lasso method is reasonable, i.e., we test the
hypotheses

\[
H_0: \beta_j \geq 0,\ j = 1, \ldots, p \quad \leftrightarrow \quad H_1: H_0 \text{ does not hold}. \tag{28}
\]
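
For readers who want to experiment with problem (27), a minimal coordinate-descent sketch is given below. It is not the algorithm of Efron et al. (2004); it is a generic solver under the assumption that the objective is exactly $\|y - X\beta\|^2 + \lambda\sum_j \beta_j$ with $\beta_j \geq 0$, for which the update of a single coordinate is $\max\{0, (x_j' r_j - \lambda/2)/x_j' x_j\}$, where $r_j$ is the residual with the $j$th term removed. The function name and the fixed number of sweeps are hypothetical choices.

```python
def nonnegative_lasso(X, y, lam, n_sweeps=200):
    """Coordinate descent for min ||y - X b||^2 + lam * sum(b) s.t. b >= 0.

    X is a list of rows, y a list of responses. Illustrative sketch only.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # equals y - X beta, with beta = 0 initially
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) for j in range(p)]
    for _ in range(n_sweeps):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Inner product of column j with the partial residual r_j.
            rho = sum(X[i][j] * residual[i] for i in range(n)) + col_sq[j] * beta[j]
            new_beta = max(0.0, (rho - lam / 2.0) / col_sq[j])
            change = new_beta - beta[j]
            if change != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * change
                beta[j] = new_beta
    return beta

# Orthogonal toy design: each coefficient is shrunk by lam / 2 and clipped at 0.
beta_hat = nonnegative_lasso([[1.0, 0.0], [0.0, 1.0]], [3.0, -1.0], lam=2.0)
```

In the toy example the second coefficient is set to zero because its unconstrained solution would be negative, which is exactly the situation that the hypotheses in (28) are meant to detect.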

In classical $n > p$ settings, the problem of testing (28) has been addressed by the likelihood
ratio test; see Silvapulle and Sen (2011). However, this method cannot be directly extended
to the high-dimensional case since the MLEs perform very poorly there. Here
we borrow the idea of the generalized likelihood ratio test in nonparametric statistics (Fan,
Zhang, and Zhang 2001), and construct the test statistic

\[
T = \frac{\|y - X\hat{\beta}_{H_0}\|^2}{\|y - X\hat{\beta}_{H_1}\|^2},
\]

where $\hat{\beta}_{H_0}$ and $\hat{\beta}_{H_1}$ are the estimators of $\beta$ under $H_0$ and $H_1$, respectively. A natural choice
is to use the nonnegative lasso estimator in (27) and the lasso estimator as $\hat{\beta}_{H_0}$ and $\hat{\beta}_{H_1}$,
respectively. Since the distribution of $T$ under $H_0$ is unclear, we use the LOT to test (28).
First of all we need to estimate all the unknown parameters under $H_0$. Wu, Yang, and
Liu (2014) showed that the nonnegative lasso estimator in (27) is consistent under $H_0$. By
Fan, Guo, and Hao (2012), a consistent estimator of $\sigma^2$ is $\hat{\sigma}^2 = \|y - X\hat{\beta}_{LS}\|^2/n$, where $\hat{\beta}_{LS}$
is the ordinary least squares estimator of $\beta$ under the submodel selected by the nonnegative
lasso. Since $p$ is large, the neighborhood $\mathcal{N}(\hat{\beta}_{H_0}, \hat{\sigma}^2)$ should be chosen carefully to avoid
high-dimensional optimization. We select $\mathcal{N}(\hat{\beta}_{H_0}, \hat{\sigma}^2)$ as

\[
\mathcal{N}(\hat{\beta}_{H_0,1}) \times \cdots \times \mathcal{N}(\hat{\beta}_{H_0,p}) \times \mathcal{N}(\hat{\sigma}^2), \tag{29}
\]

where $\hat{\beta}_{H_0,j}$ are the components of $\hat{\beta}_{H_0}$, $\mathcal{N}(\hat{\beta}_{H_0,j}) = \{0\}$ for $\hat{\beta}_{H_0,j} = 0$ and
$\mathcal{N}(\hat{\beta}_{H_0,j}) = [\hat{\beta}_{H_0,j} - \delta\hat{\sigma}, \hat{\beta}_{H_0,j} + \delta\hat{\sigma}]$ otherwise, $\mathcal{N}(\hat{\sigma}^2) = [\hat{\sigma}^2 - \delta, \hat{\sigma}^2 + \delta]$, and $\delta > 0$ is a constant. By the importance
sampling-based approach in Section 5, the p-value of the LOT for (28) is given by (11).
Note that the asymptotic results established earlier cannot be directly applied for diverging $p$.
However, it is not hard to show that, if $H_0$ holds, then $\Pr((\beta, \sigma^2) \in \mathcal{N}(\hat{\beta}_{H_0}, \hat{\sigma}^2)) \to 1$ as
$n \to \infty$ under regularity conditions, by the selection consistency of the nonnegative
lasso (Wu, Yang, and Liu 2014). Therefore, similar to Theorem 1, the asymptotic frequentist
property of the LOT can be guaranteed.
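
The product neighborhood in (29) can be represented by one interval per coordinate, as in the sketch below. It assumes the reading given above: a degenerate interval $\{0\}$ for zero coefficients, half-width $\delta\hat{\sigma}$ for non-zero coefficients, and half-width $\delta$ for $\hat{\sigma}^2$. The function name is hypothetical, and no clipping is applied to keep the bounds nonnegative.

```python
def neighborhood_29(beta_h0, sigma_hat_sq, delta):
    """Return per-coordinate (lower, upper) bounds of the box in (29)."""
    sigma_hat = sigma_hat_sq ** 0.5
    beta_bounds = []
    for b in beta_h0:
        if b == 0.0:
            beta_bounds.append((0.0, 0.0))  # degenerate interval {0}
        else:
            beta_bounds.append((b - delta * sigma_hat, b + delta * sigma_hat))
    sigma_sq_bounds = (sigma_hat_sq - delta, sigma_hat_sq + delta)
    return beta_bounds, sigma_sq_bounds

beta_bounds, sigma_sq_bounds = neighborhood_29([0.0, 2.0], 4.0, 0.5)
# Only coordinates with a non-degenerate interval need to be searched.
free_coordinates = [j for j, (lo, hi) in enumerate(beta_bounds) if hi > lo]
```

Because the zero coefficients are fixed, the optimization only involves the selected coefficients and the variance, which keeps the search low-dimensional even when $p$ is large.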

We conduct a simulation study to compare the above LOT and the bootstrap test whose
p-value is given in (8). In the simulation, the rows of $X$ in (26) are i.i.d. from a multivariate
normal distribution $N(0, \Sigma)$ whose covariance matrix $\Sigma = (\sigma_{ij})_{p \times p}$ has entries $\sigma_{ii} = 1$, $i =
1, \ldots, p$, and $\sigma_{ij} = 0.1$, $i \neq j$. The random errors $\varepsilon_1, \ldots, \varepsilon_n$ are i.i.d. $N(0, 1)$. We use three
configurations of $n$ and $p$: $(n, p) = (20, 40)$, $(40, 80)$, and $(60, 120)$. We
take the tuning parameter $\lambda = 4\sqrt{\log(p)/n}$ in the lasso and nonnegative lasso estimators, as
recommended by Wu, Yang, and Liu (2014). In the LOT, $\delta$ is set as 0.03 in (29), and we
compute the p-value in (11) with 30 try points. Here the initial design of the try points is
an LHD whose dimension is the number of non-zero $\hat{\beta}_{H_0,j}$; see (29). In the two methods,
the bootstrap sample sizes are both 2000. The significance levels $\alpha = 0.05$ and $\alpha = 0.1$ are
considered.
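
A design of try points of the kind used here can be generated as in the following sketch. It assumes that LHD denotes a Latin hypercube design and draws a plain random one, with one point per stratum in every coordinate; the space-filling designs actually used for the reported results may be constructed differently. The function name and the example box are hypothetical.

```python
import random

def latin_hypercube(n_runs, bounds, rng):
    """Random Latin hypercube design with n_runs points inside a box.

    bounds is a list of (lower, upper) pairs, one per coordinate.
    """
    columns = []
    for lower, upper in bounds:
        strata = list(range(n_runs))
        rng.shuffle(strata)  # each stratum is used exactly once per coordinate
        width = (upper - lower) / n_runs
        columns.append([lower + (s + rng.random()) * width for s in strata])
    return [tuple(col[i] for col in columns) for i in range(n_runs)]

# 30 try points in a two-dimensional box (illustrative bounds).
box = [(1.0, 3.0), (3.5, 4.5)]
try_points = latin_hypercube(30, box, random.Random(0))
```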

Four vectors of the coefficients under $H_0$ are used: (i) $\beta_1 = \cdots = \beta_p = 0$; (ii) $\beta_1 = 2$ and
Table 4: Type I errors in Section 6.3

                                     $\alpha = 0.05$                       $\alpha = 0.1$
(n, p)       Method       (i)     (ii)    (iii)   (iv)        (i)     (ii)    (iii)   (iv)
(20, 40)     Bootstrap    0.084   0.094   0.096   0.092       0.176   0.199   0.186   0.192
             LOT          0.048   0.056   0.056   0.050       0.099   0.119   0.108   0.122
(40, 80)     Bootstrap    0.134   0.172   0.160   0.168       0.248   0.302   0.282   0.308
             LOT          0.052   0.062   0.062   0.060       0.110   0.126   0.116   0.126
(60, 120)    Bootstrap    0.206   0.216   0.194   0.239       0.372   0.378   0.362   0.376
             LOT          0.066   0.054   0.060   0.051       0.128   0.126   0.100   0.136


Figure 1: Powers of the LOT in Section 6.3 (n = 40, p = 80).

$\beta_j = 0$ for other $j$; (iii) $\beta_1 = \beta_2 = 2$ and $\beta_j = 0$ for other $j$; (iv) $\beta_1 = \beta_2 = \beta_3 = 2$ and $\beta_j = 0$
for other $j$. To compute the power, we consider $\beta_1 = 2$, $\beta_2 = c < 0$, and $\beta_j = 0$ for other
$j$. For each model, we simulate 2000 data sets, and report the Type I errors and powers in
Table 4 and Figure 1, respectively. It can be seen that the bootstrap test cannot control the
Type I error well, whereas the LOT performs reasonably in terms of both Type I error and
power. The power performance of the LOT is similar for other parameter configurations.

6.4 Interval estimation for the minimum of an unknown function 

Consider the nonparametric regression model

\[
y = r(x) + \varepsilon, \tag{30}
\]

where $r$ is a continuous function defined on $[0, 1]$ and $\varepsilon \sim N(0, \sigma^2)$ is the random error.
For given $x_1, \ldots, x_n \in [0, 1]$, the responses are denoted by $y_1, \ldots, y_n$, respectively, where
$y_i = r(x_i) + \varepsilon_i$ and $\varepsilon_1, \ldots, \varepsilon_n$ are independent. Assume that $r$ has a unique minimizer $\xi$
in $[0, 1]$, i.e., $r(\xi) < r(x)$ for all $x \in [0, 1]$ with $x \neq \xi$. We are interested in constructing
confidence intervals for $\xi = \xi(r)$. Without the random error, some related problems have
been discussed in the literature; see de Haan (1981) and de Carvalho (2011), among others.
However, to the best of the author's knowledge, there is no result on interval estimation for
$\xi$ in the regression setting.

In model (30), the unknown parameter $r$ lies in an infinite-dimensional space. We shall
show that, with a fixed design for $x_1, \ldots, x_n$, the problem of constructing confidence intervals
for $\xi$ can be simplified to a finite-dimensional problem, and thus can be solved by the
approaches developed above. Here we only focus on the upper $1 - \alpha$ confidence interval for
$\alpha \in (0, 1)$. Let $\hat{r}$ and $\hat{\sigma}$ be estimators of $r$ and $\sigma$. We use $\hat{\xi}(X) = \arg\min_{x \in [0,1]} \hat{r}(x)$ as an
estimator of $\xi$ with $X = (y_1, \ldots, y_n)'$, and consider the pivotal quantity $\xi(r) - \hat{\xi}(X)$ with the
c.d.f. $H_{(r,\sigma)}(x) = \Pr(\xi(r) - \hat{\xi}(X) \leq x)$. For $a \in [0, 1]$, $b_1, \ldots, b_n \in \mathbb{R}$, and $c > 0$, let $X^* =
(y_1^*, \ldots, y_n^*)'$ denote the set of independent random variables $y_i^* \sim N(b_i, c^2)$ for $i = 1, \ldots, n$.
Let $\theta = (a, b_1, \ldots, b_n, c)'$ and $H_\theta(x) = \Pr(a - \hat{\xi}(X^*) \leq x)$. Denote by $\mathcal{N}_n(\hat{\theta})$ a
neighborhood of $\hat{\theta} = (\hat{\xi}, \hat{r}(x_1), \ldots, \hat{r}(x_n), \hat{\sigma})'$. Since $H_{(r,\sigma)}(x) = H_{(\xi, r(x_1), \ldots, r(x_n), \sigma)}(x)$, the
following proposition is straightforward.



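Figure 2: Four regression functions in simulations in Section 6.4.

Proposition 3. For all $r$ and $\sigma$, if $\Pr\big((\xi, r(x_1), \ldots, r(x_n), \sigma)' \in \mathcal{N}_n(\hat{\theta})\big) \to 1$, then

\[
\liminf_{n \to \infty} \Pr\Big(\xi \leq \hat{\xi} + \sup_{\phi \in \mathcal{N}_n(\hat{\theta})} H_\phi^{-1}(1 - \alpha)\Big) \geq 1 - \alpha.
\]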

By Proposition 3, we can obtain LOCIs of $\xi$ which have the asymptotic frequentist
properties through optimizing the quantiles of $H_\phi$ over a local region. In the following, let
$\hat{r}$ be the Nadaraya-Watson estimator with kernel function $K$ and bandwidth $h$ (Hart 1997).
Under regularity conditions, $\sup_{x \in [0,1]} |\hat{r}(x) - r(x)| \to 0$ in probability (Härdle and Luckhaus
1984), which implies that $\hat{\xi}$ is a consistent estimator of $\xi$. Additionally, a consistent estimator
$\hat{\sigma}^2$ of $\sigma^2$ can be obtained from the residual sum of squares of $\hat{r}$. A choice of $\mathcal{N}_n(\hat{\theta})$ satisfying the
condition in Proposition 3 is

\[
[\hat{\xi} - \delta\hat{\sigma}, \hat{\xi} + \delta\hat{\sigma}] \times [\hat{r}(x_1) - \delta\hat{\sigma}, \hat{r}(x_1) + \delta\hat{\sigma}] \times \cdots \times [\hat{r}(x_n) - \delta\hat{\sigma}, \hat{r}(x_n) + \delta\hat{\sigma}] \times [\hat{\sigma} - \delta, \hat{\sigma} + \delta], \tag{31}
\]

where $\delta > 0$ is a constant.
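
The estimator of the minimizer can be sketched as follows: a Nadaraya-Watson fit with the Epanechnikov kernel, followed by a grid search for the smallest fitted value on $[0, 1]$. The grid search, the grid size, and the bandwidth value used in the example are assumptions made for illustration, and the function names are hypothetical.

```python
def epanechnikov(u):
    """Epanechnikov kernel K(u) = 0.75 * (1 - u^2) on (-1, 1)."""
    return 0.75 * (1.0 - u * u) if abs(u) < 1.0 else 0.0

def nadaraya_watson(x0, xs, ys, h):
    """Kernel-weighted average of ys at x0 with bandwidth h."""
    weights = [epanechnikov((x0 - x) / h) for x in xs]
    total = sum(weights)
    if total == 0.0:
        return float("inf")  # no design point near x0; never chosen as the minimum
    return sum(w * y for w, y in zip(weights, ys)) / total

def estimate_minimizer(xs, ys, h, grid_size=201):
    """Grid-search estimate of the minimizer of the fitted curve on [0, 1]."""
    grid = [k / (grid_size - 1) for k in range(grid_size)]
    fitted = [nadaraya_watson(g, xs, ys, h) for g in grid]
    best = min(range(grid_size), key=lambda k: fitted[k])
    return grid[best]

# Noise-free toy data from r(x) = |x - 1/2| on the design x_i = (2i - 1)/(2n).
n = 20
xs = [(2 * i - 1) / (2 * n) for i in range(1, n + 1)]
ys = [abs(x - 0.5) for x in xs]
xi_hat = estimate_minimizer(xs, ys, h=0.2)
```

In the neighborhood bootstrap, the same function would be applied to each simulated response vector $X^*$ to obtain $\hat{\xi}(X^*)$.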

We next conduct a simulation study to compare the bootstrap two-sided $1 - \alpha$ confidence
Table 5: Simulation results in Section 6.4

                                  n = 20                      n = 30
Function     Method       CR      ML      SDL         CR      ML      SDL
(I)          Bootstrap    0.635   0.264   0.066       0.643   0.243   0.060
             LOCI         0.933   0.487   0.062       0.948   0.457   0.055
(II)         Bootstrap    0.749   0.189   0.176       0.766   0.161   0.159
             LOCI         0.935   0.281   0.220       0.950   0.251   0.199
(III)        Bootstrap    0.710   0.272   0.091       0.656   0.233   0.069
             LOCI         0.973   0.573   0.127       0.963   0.490   0.095
(IV)         Bootstrap    0.721   0.486   0.203       0.721   0.446   0.175
             LOCI         0.927   0.794   0.192       0.957   0.787   0.161


intervals and LOCIs with $\alpha = 0.05$. Four regression functions in (30) are considered:

(I): $r(x) = 2(2x - 1)^2$; (II): $r(x) = 2/(x + 1)$;

(III): $r(x) = \sin(2\pi x + 3\pi/4)/2$; (IV): $r(x) = |x - 1/2|$;

see Figure 2. For these functions, the values of $\xi$ are 1/2, 1, 3/8, and 1/2, respectively.
We fix $\sigma^2 = 1/4$ and $x_i = (2i - 1)/(2n)$ for $i = 1, \ldots, n$. The kernel function $K$ in
$\hat{r}$ is the Epanechnikov kernel, and the bandwidth $h$ is set as [...]. In LOCIs, we use
$\delta = 0.25$ in (31), and take 60-run LHDs as the initial designs of try points for implementing
neighborhood bootstrap. The bootstrap sample size is 5000. Based on 5000 repetitions, we
report the simulation results in Table 5. It can be seen that the bootstrap method performs
poorly in terms of CR, and that the LOCI is much better in all cases.
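
The stated minimizers of the four regression functions can be checked numerically with a short grid search. In this sketch the exponent 2 in function (I) is an assumption (any even exponent gives the same minimizer 1/2), and the grid size and names are hypothetical.

```python
import math

REGRESSION_FUNCTIONS = {
    "I": lambda x: 2.0 * (2.0 * x - 1.0) ** 2,  # exponent 2 is an assumption
    "II": lambda x: 2.0 / (x + 1.0),
    "III": lambda x: math.sin(2.0 * math.pi * x + 0.75 * math.pi) / 2.0,
    "IV": lambda x: abs(x - 0.5),
}

def grid_minimizer(r, grid_size=1001):
    """Return the grid point in [0, 1] with the smallest value of r."""
    grid = [k / (grid_size - 1) for k in range(grid_size)]
    return min(grid, key=r)

minimizers = {name: grid_minimizer(r) for name, r in REGRESSION_FUNCTIONS.items()}

# Fixed design x_i = (2i - 1)/(2n), shown here for n = 20.
design = [(2 * i - 1) / (2 * 20) for i in range(1, 21)]
```

For function (III) the minimizer solves $2\pi x + 3\pi/4 = 3\pi/2$, which gives $x = 3/8$ in agreement with the grid search.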
7 Discussion


In this paper we have introduced local optimization-based inference, including the LOT and
the LOCI. The main advantage of our approach is that, unlike current frequentist approaches,
it does not require hard work in deriving (asymptotic) distributions, since its asymptotic
frequentist properties hold as long as we have consistent estimators of the unknown parameters.
The implementation of our approach is based on standard computational methods such
as importance sampling and Monte Carlo, which are easy for practitioners to master. Local
optimization-based inference can be viewed as an extended bootstrap that complements
current frequentist inference. It can quickly provide frequentist solutions to complex problems
in practice, and has broad potential applications, as the illustrative examples have shown
to some extent. Although local optimization-based inference does not overshoot for regular
problems (see Theorem 2), it is more suitable for non-regular problems in which the
theoretical derivation is difficult.

We give a further discussion on the specification of the neighborhood $\mathcal{N}(\hat{\theta})$ here. Generally
speaking, the choice of $\mathcal{N}(\hat{\theta})$ is flexible; see Section 6. In real applications, for a dataset
with fixed sample size $n$, it is not hard to find, via empirical evaluations, a proper $\mathcal{N}(\hat{\theta})$ under
which the LOT or LOCI performs satisfactorily. Besides the methods discussed earlier,
we can also use informative priors, if any, to inform the construction of $\mathcal{N}(\hat{\theta})$. This
provides a way to connect our approach with Bayesian statistics, and is worth studying
in the future. A problem related to the specification of $\mathcal{N}(\hat{\theta})$ is that it is difficult to obtain the
exact solution of the optimization problem, or to know how close an approximate solution is to it, even for a small $\mathcal{N}(\hat{\theta})$. This
problem is not very serious in practice since our ultimate goal is inference rather than optimization.
Simulation results in Section 6 show that the design-based approximation with a moderate
$L$ yields satisfactory finite-sample performance of the LOT and LOCI even for high-dimensional
$\mathcal{N}(\hat{\theta})$. In fact, when the bootstrap gives aggressive results, local optimization-based inference can
always improve its performance, even with a relatively poor optimization algorithm, since
the corresponding optimization problem possesses a better solution principle (Xiong 2014):
a better approximation to the exact solution yields a smaller Type I error or a higher coverage rate.
A disadvantage of local optimization-based inference is its computational cost, which
can be viewed as the price of generality. We can replace the Monte Carlo method in the
implementation with LHD sampling or quasi-Monte Carlo to improve the computational
efficiency (Homem-de-Mello 2008). Iterative algorithms such as stochastic approximation
(Kushner and Yin 1997) are also available for solving the underlying stochastic programming
problem. Another future topic is to apply the proposed approach to general infinite-dimensional
problems, which call for infinite-dimensional optimization techniques. For the neighborhood
bootstrap method, this requires new space-filling designs in infinite-dimensional spaces.


Appendix: Asymptotic properties of the design-based 
algorithm 

As mentioned in Section 5, $\max_{l=1,\ldots,L_n} H_{\phi_l}^{-1}(1 - \alpha)$ can be used to approximate the upper
limit $\sup_{\phi \in \mathcal{N}_n(\hat{\theta})} H_\phi^{-1}(1 - \alpha)$ in the LOCI, where $\{\phi_1, \ldots, \phi_{L_n}\}$ is a dense subset of $\mathcal{N}_n(\hat{\theta})$. We
next prove frequentist properties of this approximation. These results are less important
in practice since, with a powerful computer, the approximation can be made as accurate as
desired. We place them here because they may still be of theoretical interest.
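
The design-based approximation described above can be sketched generically: for each try point, simulate the pivotal quantity, take the empirical $(1 - \alpha)$-quantile, and report the maximum over the try points. The toy model used for illustration (normal data with the sample mean as estimator), the function names, and the particular empirical-quantile convention are assumptions and are not taken from the paper.

```python
import math
import random

def upper_quantile(values, level):
    """Empirical quantile: the order statistic of rank ceil(level * m)."""
    ordered = sorted(values)
    index = max(0, math.ceil(level * len(ordered)) - 1)
    return ordered[index]

def max_quantile_over_design(try_points, simulate_pivot, alpha, n_boot, rng):
    """Replace the supremum over the neighborhood by a maximum over try points."""
    quantiles = []
    for phi in try_points:
        draws = [simulate_pivot(phi, rng) for _ in range(n_boot)]
        quantiles.append(upper_quantile(draws, 1.0 - alpha))
    return max(quantiles), quantiles

def toy_pivot(phi, rng, n=25):
    """Toy pivot: phi = (mu, sigma), data are n i.i.d. N(mu, sigma^2) draws,
    and the pivot is mu minus the sample mean."""
    mu, sigma = phi
    sample_mean = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
    return mu - sample_mean

try_points = [(0.0, 0.9), (0.0, 1.0), (0.0, 1.1)]
upper, per_point = max_quantile_over_design(
    try_points, toy_pivot, alpha=0.05, n_boot=2000, rng=random.Random(1))
```

Adding try points can only increase the maximum, so a denser design moves the approximation toward the supremum from below; this is the property that the results in this appendix formalize.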

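Assumption 5. Let $\{a_n\}$ be a sequence of positive numbers. Denote $H_\phi(x) = \Pr(a_n[\xi(\phi) -
\hat{\xi}(X^*)] \leq x \mid X)$, where $X^*$ is drawn from $F(\cdot, \phi)$ given $X$. As $n \to \infty$,

\[
\max_{\phi \in \mathcal{N}_n(\hat{\theta})} \min_{l=1,\ldots,L_n} \big|H_\phi^{-1}(1 - \alpha) - H_{\phi_l}^{-1}(1 - \alpha)\big| = o_p(1).
\]

Assumption 6. As $n \to \infty$ and $\delta \to 0$, $H_\theta\big(H_\theta^{-1}(1 - \alpha) - \delta\big) \to 1 - \alpha$.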

Note that the limits of $H_{\phi_1}$ and $H_{\phi_2}$ can be different for $\phi_1, \phi_2 \in \mathcal{N}_n(\hat{\theta})$. Assumption
5 requires that $\{\phi_1, \ldots, \phi_{L_n}\}$ be dense enough that any value of $H_\phi^{-1}(1 - \alpha)$ can
be approximated accurately by some element of $\{H_{\phi_l}^{-1}(1 - \alpha)\}_{l=1,\ldots,L_n}$. Assumption 5 holds
under Assumptions 2, [...] and relates to some space-filling criterion (Johnson, Moore, and
Ylvisaker 1990). Assumption 6 says that $H_\theta$ is asymptotically continuous at $H_\theta^{-1}(1 - \alpha)$.
Under Assumption [...] (i), Assumption 6 holds.

Theorem 3. Under Assumptions [...], 5, and 6,

\[
\liminf_{n \to \infty} \Pr\Big(\xi \leq \hat{\xi} + a_n^{-1} \max_{l=1,\ldots,L_n} H_{\phi_l}^{-1}(1 - \alpha)\Big) \geq 1 - \alpha.
\]


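Proof. For any $n$, there exists $\theta^{**} \in \mathcal{N}_n(\hat{\theta})$ such that $\sup_{\phi \in \mathcal{N}_n(\hat{\theta})} H_\phi^{-1}(1 - \alpha) < H_{\theta^{**}}^{-1}(1 - \alpha) + 1/n$.
Denote $l^* = \arg\min_{l=1,\ldots,L_n} |H_{\theta^{**}}^{-1}(1 - \alpha) - H_{\phi_l}^{-1}(1 - \alpha)|$. We have $\max_{l=1,\ldots,L_n} H_{\phi_l}^{-1}(1 - \alpha) \geq
H_{\phi_{l^*}}^{-1}(1 - \alpha) \geq H_{\theta^{**}}^{-1}(1 - \alpha) - |H_{\theta^{**}}^{-1}(1 - \alpha) - H_{\phi_{l^*}}^{-1}(1 - \alpha)|$. Therefore,

\begin{align*}
\Pr\Big(a_n(\xi - \hat{\xi}) \leq \max_{l=1,\ldots,L_n} H_{\phi_l}^{-1}(1 - \alpha)\Big)
&\geq \Pr\Big(a_n(\xi - \hat{\xi}) \leq H_{\theta^{**}}^{-1}(1 - \alpha) - \big|H_{\theta^{**}}^{-1}(1 - \alpha) - H_{\phi_{l^*}}^{-1}(1 - \alpha)\big|\Big) \\
&\geq \Pr\Big(a_n(\xi - \hat{\xi}) \leq \sup_{\phi \in \mathcal{N}_n(\hat{\theta})} H_\phi^{-1}(1 - \alpha) + o_p(1) - 1/n\Big) \\
&\geq \Pr\Big(a_n(\xi - \hat{\xi}) \leq H_\theta^{-1}(1 - \alpha) + o_p(1)\Big) - \Pr\big(\theta \notin \mathcal{N}_n(\hat{\theta})\big) \\
&= 1 - \alpha + o(1) - \Pr\big(\theta \notin \mathcal{N}_n(\hat{\theta})\big) \to 1 - \alpha,
\end{align*}

which completes the proof. □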

The following theorem is straightforward. 

Theorem 4. Under Assumptions [...], for all $\theta \in \Theta$ and $\alpha \in (0, 1)$,

\[
\lim_{n \to \infty} \Pr\Big(\xi \leq \hat{\xi} + a_n^{-1} \max_{l=1,\ldots,L_n} H_{\phi_l}^{-1}(1 - \alpha)\Big) = 1 - \alpha.
\]